(57) Abstract:

PROBLEM TO BE SOLVED: To provide an information 
retrieving device in which retrieval can be executed by 
automatically detecting and correcting an erroneously inputted 
keyword phrase. 

SOLUTION: A dictionary 16 for retrieval word correction is a 
dictionary including synonym knowledge for obtaining the 
synonym of a phrase inputted as a keyword, co-occurrence 
word knowledge for obtaining a co-occurrence word, and related 
term knowledge for obtaining a related term or the like. A 
retrieval word correcting part 12 detects an erroneously inputted 
phrase included in a retrieval condition inputted through a 
retrieval condition inputting part 11 by referring to the dictionary 
16 for retrieval word correction. Then, the retrieval word 
correcting part 12 corrects this detected erroneously inputted 
phrase by the dictionary 16 for retrieval word correction, and 
stores the retrieval condition including the corrected phrase in a 
retrieval condition storing part 13. Thus, a retrieving part 14 is 
allowed to execute retrieval using the corrected phrase. 
CLAIMS



[Claim(s)] 

[Claim 1] An information retrieval device that retrieves information corresponding to a
search condition including words and phrases inputted as keywords, comprising:

a dictionary for words-and-phrases correction containing at least one of similar-word
knowledge, co-occurrence word knowledge, and related-term knowledge;
an erroneous-input words-and-phrases detection means which detects erroneously
inputted words and phrases by using said dictionary for words-and-phrases correction;
and

a words-and-phrases correcting means which corrects, by using said dictionary for
words-and-phrases correction, the words and phrases detected by said erroneous-input
words-and-phrases detection means.

[Claim 2] An information retrieval device that retrieves information corresponding to a
search condition including words and phrases inputted as keywords, comprising:

a dictionary for words-and-phrases selection containing at least one of similar-word
knowledge, co-occurrence word knowledge, and related-term knowledge; and
a words-and-phrases selection means which, when said inputted words and phrases are
polysemous, selects one of their interpretations by using said dictionary for
words-and-phrases selection.

[Claim 3] The information retrieval device according to claim 1 or 2, further comprising
an evaluation input means for inputting an evaluation that indicates the degree of
usefulness of each piece of retrieved information, and a feedback means which analyzes
the evaluation inputted by said evaluation input means and corrects said words and
phrases.

[Claim 4] An information retrieval method for retrieving information corresponding to a
search condition including words and phrases inputted as keywords, using a dictionary
for words-and-phrases correction containing at least one of similar-word knowledge,
co-occurrence word knowledge, and related-term knowledge, the method comprising:
detecting erroneously inputted words and phrases by using said dictionary for
words-and-phrases correction; correcting the detected words and phrases by using said
dictionary for words-and-phrases correction; and performing a search using a search
condition that includes the corrected words and phrases.

[Claim 5] An information retrieval method for retrieving information corresponding to a
search condition including words and phrases inputted as keywords, using a dictionary
for words-and-phrases selection containing at least one of similar-word knowledge,
co-occurrence word knowledge, and related-term knowledge, the method comprising:
when said inputted words and phrases are polysemous, selecting one of their
interpretations by using said dictionary for words-and-phrases selection; reflecting
the selection result in said search condition; and then performing a search.

DETAILED DESCRIPTION 



[Detailed Description of the Invention] 

[0001] [Field of the Invention] This invention relates to an information retrieval device
and an information retrieval method that retrieve information corresponding to a search
condition including words and phrases inputted as keywords, and in particular to a
device and method that automatically detect and correct erroneously inputted words and
phrases before performing a search.

[0002] [Description of the Prior Art] In recent years, with the spread of personal
computers, the Internet, electronic libraries, and the like, the quantity of information
that an individual can access has been increasing steadily, and the media of that
information have diversified into text, pictures, sound, and so on. In this situation,
demand is growing for advanced information retrieval systems that retrieve only the
desired information from this huge body of information.

[0003] When using an ordinary information retrieval system, a user conveys his or her
demand to the system by inputting one or more search terms and composing a search
condition in a form the system can understand. On the assumption that "a user's
demand = search condition", the system selects and outputs, from the retrieval target,
only the information that satisfies the search condition.
[0004] However, the assumption "a user's demand = search condition" often does not hold
in practice. In particular, when a search condition is specified through an input device
such as a keyboard, a character reader, or voice recognition equipment, an erroneous
input of a search term may produce a search condition that differs widely from the
user's intention. For example, when a user tries to specify a search term in Chinese
characters using a kana-kanji conversion system, a conversion error may input a
completely unrelated homonym.
[0005] An erroneous input that produces an impossible spelling can be detected and
corrected by looking up every search term in a word dictionary. However, when the
erroneous input yields a valid word, as in the homonym example mentioned above, a
conventional information retrieval system can neither detect nor correct it. As a
result, wrong search results were obtained from the wrong search condition, and in some
cases the user noticed the erroneous input only at that point. In other cases, the user
did not notice the erroneous input at all, even after wrong search results had been
obtained.

[0006] [Problem(s) to be Solved by the Invention] Thus, a conventional information
retrieval system has the problem that, when a search term is erroneously inputted as
another valid word, the error can be neither detected nor corrected.



[0007] This invention has been made in view of these circumstances. Its purpose is to
provide an information retrieval device and an information retrieval method that
automatically detect and correct ****** and then perform a search.

[0008] [Means for Solving the Problem] To attain the purpose mentioned above, this
invention uses a dictionary that includes, for example, similar-word knowledge for
obtaining words similar to the words and phrases inputted as keywords, co-occurrence
word knowledge for obtaining co-occurrence words, and related-term knowledge for
obtaining related terms, and performs a search after detecting and correcting the
erroneously inputted words and phrases.
[0009] The efficiency and accuracy of a search are raised by using this dictionary to
select an interpretation of polysemous words and phrases. The accuracy of re-retrieval
is raised by analyzing the user's evaluation of the search results and feeding the
analysis result back into the detection and correction of erroneously inputted words
and phrases.

[0010] Thus, even when an erroneous input occurs, the search condition that the user
originally intended is created and the search is performed with it, which makes it
possible to provide a suitable information retrieval environment.

[0011] [Embodiment of the Invention] Hereafter, embodiments of this invention are
described with reference to the drawings.

(A 1st embodiment) A 1st embodiment of this invention is described first. Drawing 1
shows the composition of the information retrieval system according to the 1st
embodiment. As shown in drawing 1, this information retrieval system 10 consists of the
search condition input section 11, the search term correction part 12, the search
condition storage part 13, the retrieval part 14, and the search-results outputting
part 15. Here, the search condition input section 11 corresponds to input devices such
as a keyboard, a character reader, and voice recognition equipment; the search-results
outputting part 15 corresponds to output units such as a display and a printer; and the
search condition storage part 13 corresponds to main memory, a hard disk drive, or the
like. The retrieval part 14 and the search term correction part 12 correspond to
programs whose execution is controlled by a CPU.

[0012] The search condition that the user inputs into the search condition input section
11 is passed to the search term correction part 12, which corrects the words and phrases
serving as keywords, i.e., the search terms, as needed. The corrected search condition
is stored in the search condition storage part 13, and the retrieval part 14 retrieves
information according to this search condition. The search results are outputted to the
user by the search-results outputting part 15.
[0013] The only difference between a conventional information retrieval system and this
information retrieval system 10 is that, whereas in the former the search condition
inputted into the search condition input section 11 is passed directly to the search
condition storage part 13, in the latter it is first passed to the search term
correction part 12 and only then to the search condition storage part 13. Therefore,
apart from the processing of the search term correction part 12, any existing search
system may be used. As long as the search is one in which the user specifies one or
more search terms, the method of specifying the search condition, the retrieval target,
the search method, and so on do not matter. The search terms may be in any language,
not only Japanese and English. The following explanation focuses on the operation of
the search term correction part 12.

[0014] Drawing 2 shows an example of the processing flow of the search term correction
part 12 of the 1st embodiment. The search term correction part 12 first receives the
search condition that the user inputted from the search condition input section 11
(Step A1), analyzes it, and identifies the search terms (Step A2). The following
processing is then performed for each search term.

[0015] The similar-word knowledge of the search term currently under attention, and the
co-occurrence word knowledge and related-term knowledge of the other search terms, are
taken out from the dictionary 16 for search term correction (Step A4). Next, this
knowledge is used to judge whether the search term under attention is an erroneous
input (Step A5). When it is judged to be an erroneous input (Y of Step A5), the search
condition is corrected by replacing the search term with the above-mentioned similar
word (Step A6).

[0016] Instead of correcting the search term fully automatically, the system may first
perform only the detection of an erroneous input, display a message asking the user
"whether to correct", and then make the correction interactively.

[0017] Drawing 3 shows an example of the similar-word knowledge, co-occurrence word
knowledge, and related-term knowledge registered in the dictionary 16 for search term
correction of the 1st embodiment. In this 1st embodiment, when "the word A" can turn
into "the word B" through an erroneous input such as a kana-kanji conversion error or a
spelling error, "the word A" and "the word B" are treated as similar words of each
other.

[0018] In the example of drawing 3, two kinds of similar-word knowledge are shown:
homonym knowledge and similar-notation word knowledge. (Knowledge a) - (knowledge d) of
drawing 3 are examples of homonym knowledge, and (knowledge e) - (knowledge g) are
examples of similar-notation word knowledge. For example, (knowledge a) shows that the
words "place of meeting" and "analysis" both have the reading "paddle cough" (a literal
rendering of their shared kana reading), and (knowledge g) shows that the words
"leader" and "reader" are similar in notation and are easily inputted one for the
other. Homonym knowledge can be built, for example, from an existing dictionary for
kana-kanji conversion. Similar-notation word knowledge can be built, for example, by
mechanically enumerating the pairs of words that differ by a single character, or by
collecting data on the spelling errors that Japanese speakers make.
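
The mechanical enumeration of single-character-difference pairs mentioned above could be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and "differ by a single character" is assumed here to mean a single substitution between words of equal length, which the text does not specify.

```python
from itertools import combinations


def differs_by_one_char(a: str, b: str) -> bool:
    """True when a and b have equal length and differ in exactly one position.

    Assumption: only substitutions count; insertions and deletions are ignored.
    """
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def build_similar_notation_pairs(vocabulary):
    """Enumerate all unordered word pairs that differ by a single character."""
    words = sorted(set(vocabulary))
    return [(a, b) for a, b in combinations(words, 2) if differs_by_one_char(a, b)]


# Hypothetical vocabulary; "leader"/"reader" is the pair cited as (knowledge g).
pairs = build_similar_notation_pairs(["leader", "reader", "search", "satellite"])
print(pairs)
```

Comparing every pair is quadratic in the vocabulary size, so a real dictionary would likely need an index; the sketch favors clarity over speed.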

[0019] In this 1st embodiment, when "the word A" and "the word B" appear in the same
document, paragraph, or sentence, they are treated as co-occurrence words of each
other. When "the word A" and "the word B" are semantically related, they are treated as
related terms of each other. (Knowledge h) - (knowledge l) of drawing 3 are examples of
co-occurrence word knowledge, and (knowledge m) - (knowledge o) are examples of
related-term knowledge. For example, (knowledge j) shows that the words "information"
and "search" co-occur in many cases, and (knowledge m) shows that "personal computer"
and "office computer" are semantically closely related words.
[0020] Co-occurrence word knowledge can be built, for example, from the co-occurrence
data of an existing dictionary for kana-kanji conversion. Related-term knowledge can be
built, for example, from the information on sibling words (words with the same parent
node) in an existing thesaurus, or by using the above-mentioned co-occurrence word
knowledge with the reasoning that, if "the word A" co-occurs with "the word B" and "the
word C" also co-occurs with "the word B", then "the word A" and "the word C" are related
terms because they appear in the same context. The concrete method of constructing the
similar-word knowledge, co-occurrence word knowledge, and related-term knowledge is not
the main point of this invention, and any method may be used.
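
The co-occurrence-based reasoning in paragraph [0020] (words A and C are related when both co-occur with the same word B) might be sketched as below. The function name and the sample pairs are hypothetical and are not taken from drawing 3; as the paragraph itself notes, any construction method may be used.

```python
from collections import defaultdict
from itertools import combinations


def related_terms_from_cooccurrence(cooccurrence_pairs):
    """Derive related-term pairs: A and C are related if both co-occur with some B."""
    neighbors = defaultdict(set)
    for a, b in cooccurrence_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    related = set()
    for partners in neighbors.values():
        # Every two distinct words sharing this co-occurrence partner become related.
        for a, c in combinations(sorted(partners), 2):
            related.add((a, c))
    return related


# Hypothetical co-occurrence data for illustration only.
related = related_terms_from_cooccurrence([
    ("broadcast", "satellite"),
    ("communication", "satellite"),
])
print(related)
```

In practice a frequency threshold would probably be needed, since very common words co-occur with almost everything and would otherwise relate unrelated terms.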
[0021] Drawing 4 shows examples of search conditions that are passed from the search
condition input section 11 to the search term correction part 12 of the 1st embodiment
and that contain an erroneously inputted search term. Drawing 4 (1) - (6) are examples
of Boolean search, in which a search condition is composed by combining two or more
search terms with the AND, OR, and NOT operators.

[0022] Drawing 4 (1) is an example in which the user, intending to retrieve information
containing both of the two words "natural language" and "analysis", accidentally
entered "place of meeting" instead of "analysis" through a kana-kanji conversion error.
Drawing 4 (2) is an example in which the user, attempting the same search, typed
"paddle cough" in hiragana and confirmed the input without converting it into Chinese
characters. A look at word-processor documents and the like makes it clear that people
tend to make such mistakes.

[0023] In these examples, by using (knowledge a) of drawing 3, the search term
correction part 12 finds that the 2nd search term may need to be corrected to
"analysis" or "tea lunch".

[0024] On the other hand, drawing 3 contains (knowledge h) and (knowledge i) as
co-occurrence word knowledge about the 1st search term, "natural language". Of these,
(knowledge i) shows that "natural language" and "analysis" co-occur, so it turns out
that the 2nd search term should be corrected to "analysis".

[0025] Drawing 4 (3) shows an example of an erroneous input in a Boolean search using
NOT. The user wants to retrieve information containing the word "CS" used as an
abbreviation of "customer satisfaction". Because "CS" also means "communications
satellite", the user intended to input NOT "satellite" in order to exclude information
containing "CS" in the latter meaning, but accidentally entered "health" instead. In
this case, (knowledge b) and (knowledge k) of drawing 3 show that "health" should be
corrected to "satellite".

[0026] Drawing 4 (1) - (3) were examples of correcting a kana-kanji conversion error
using the homonym knowledge within the similar-word knowledge. In contrast, drawing 4
(4) - (5) are examples of correcting an input error using the similar-notation word
knowledge within the similar-word knowledge.

[0027] Drawing 4 (4) is a case in which the user tried to perform the Boolean search
"information" AND "search" but entered "criminal investigation" instead of "search". In
this case, (knowledge f) of drawing 3 shows that the intended search term may be
"search", whose notation resembles that of "criminal investigation", and (knowledge j)
of drawing 3 shows that "information" and "search" co-occur, so it turns out that
"criminal investigation" should be corrected to "search".

[0028] Drawing 4 (5) is the same as (4), except that the search terms are in English.
The user tried to retrieve information about an "optical character reader" but
misspelled "reader" and entered "leader". This can be corrected using (knowledge g) and
(knowledge l) of drawing 3.

[0029] Drawing 4 (1) - (5) were examples of search term correction using similar-word
knowledge and co-occurrence word knowledge. In contrast, drawing 4 (6) is an example of
search term correction using similar-word knowledge and related-term knowledge.

[0030] Drawing 4 (6) shows a case in which a user who tried to retrieve information
about "broadcasting satellite", "satellite broadcasting", "communications satellite",
"satellite communication", and so on using AND and OR erroneously entered "package"
instead of "broadcast". In this case, (knowledge c) of drawing 3 shows that "broadcast"
is a homonym of "package", and (knowledge n) of drawing 3 shows that "communication" is
a related term of "broadcast". Since "package" and "communication" are connected by OR
in the search condition, it turns out that "package" should be corrected to
"broadcast".
[0031] Drawing 5 shows an example of the erroneous-input judging algorithm (Step A5 of
drawing 2) for the i-th search term. Here, as an example, the search condition is
drawing 4 (1), the i-th search term is "place of meeting", and the dictionary 16 for
search term correction includes the knowledge of drawing 3. First, the homonyms and
similar-notation words of the i-th search term are taken out from the similar-word
knowledge; (knowledge a) of drawing 3 yields the homonyms "paddle cough", "analysis",
and "tea lunch" (Step B1). Next, from the co-occurrence word knowledge and related-term
knowledge about the search terms other than the i-th, i.e., "natural language", the
entries containing "paddle cough", "analysis", or "tea lunch" are taken out; this
yields {(knowledge i) natural language, analysis} of drawing 3 (Step B2). Since such an
entry exists (Y of Step B3), the search term "place of meeting" is judged to be an
erroneous input that should be corrected to "analysis" (Step B4).
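
Steps B1 - B4 described in paragraph [0031] can be illustrated with the following minimal sketch. The data structures (`SIMILAR_WORDS`, `RELATED_PAIRS`) and function names are assumptions made for illustration; the dictionary holds only the two entries the text quotes from drawing 3, not the full drawing.

```python
# Toy stand-in for the dictionary 16 for search term correction (assumed layout).
SIMILAR_WORDS = {
    # (knowledge a): homonyms of "place of meeting", as listed in paragraph [0031]
    "place of meeting": ["paddle cough", "analysis", "tea lunch"],
}
RELATED_PAIRS = {
    # (knowledge i): "natural language" and "analysis" co-occur
    frozenset({"natural language", "analysis"}),
}


def judge_erroneous_input(terms, i):
    """Return the suggested replacement for terms[i], or None if no error is detected."""
    candidates = SIMILAR_WORDS.get(terms[i], [])            # Step B1
    others = [t for j, t in enumerate(terms) if j != i]
    for candidate in candidates:                             # Step B2
        for other in others:
            if frozenset({candidate, other}) in RELATED_PAIRS:   # Step B3
                return candidate                             # Step B4
    return None


def correct_terms(terms):
    """Apply the judgment to every search term and replace the detected errors."""
    corrected = list(terms)
    for i in range(len(terms)):
        replacement = judge_erroneous_input(terms, i)
        if replacement is not None:
            corrected[i] = replacement
    return corrected


print(correct_terms(["natural language", "place of meeting"]))
```

The sketch returns the first supported candidate; how ties between several supported candidates would be resolved is not stated in the text and is left open here.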
[0032] Drawing 6 (1) - (6) show examples of the results of correcting the search
conditions of drawing 4 (1) - (6) with this method. Although drawing 4 (1) - (6) were
examples of Boolean search, this invention does not depend on the form of the search
condition. Drawing 4 (7) - (8) show examples of search conditions in search systems
that use retrieval methods other than Boolean search.
[0033] Drawing 4 (7) is an example in which the user tried to retrieve information in
which the words "natural language" and "analysis" appear in the same paragraph, and
accidentally entered "place of meeting" instead of "analysis". Similarly, drawing 4 (8)
is an example in which the user tried to retrieve information in which both "natural
language" and "analysis" appear in the 1st sentence. These cases can be corrected in
the same way as drawing 4 (1).

[0034] In addition, it is equally effective to apply search term correction to searches
that specify the frequency of a search term, searches that specify the distance between
search terms, and so on.

[0035] (A 2nd embodiment) Next, a 2nd embodiment of this invention is described. Drawing
7 shows the composition of the information retrieval system according to the 2nd
embodiment. As shown in drawing 7, the difference in system configuration between the
1st embodiment and the 2nd embodiment is that the latter has the search-results
evaluation information input part 18. The difference in processing flow is that,
whereas the former performs a search after correcting the inputted search condition,
the latter first performs a search using the search condition as inputted and corrects
the search condition only after acquiring the user's evaluation information on the
search results. The 2nd embodiment is therefore premised on performing re-retrieval.
Hereafter, only the points that differ from the 1st embodiment are explained in detail.
[0036] The retrieval part 14 performs the 1st search using the search condition inputted
by the user, and the search-results outputting part 15 outputs the search results to the
user. Next, the user examines these search results, evaluates whether they met his or
her demand, and inputs the evaluation result into the search-results evaluation
information input part 18. This evaluation information is passed to the search term
correction part 12. The search term correction part 12 analyzes which of the retrieved
information suited the user's demand and which did not, and corrects the search
condition on that basis if necessary. The retrieval part 14 then performs re-retrieval
using the corrected search condition.

[0037] Drawing 8 shows an example of the processing flow of the search-results
evaluation information input part 18 of the 2nd embodiment. The search-results
evaluation information input part 18 receives the user's evaluation of each piece of
information outputted to the user by the search-results outputting part 15 (Step C2 -
Step C3). Here, the user's evaluation means informing the search system of how useful
each piece of retrieved information was to the user. For example, a two-level
evaluation ("this information was useful" or "this information was not useful") or a
multi-level evaluation by scoring can be considered. Evaluating search results is
itself publicly known art, introduced for example as "relevance judgment" in the
literature ("information retrieval theory", original by David Ellis, translation
supervised by Kimio Hosono, Maruzen), and it is not the main point of this invention.
Finally, the search-results evaluation information input part 18 passes the
above-mentioned evaluation information to the search term correction part 12 (Step C6).
[0038] Drawing 9 shows an example of the processing flow of the search term correction
part 12 of the 2nd embodiment. The search term correction part 12 first receives the
evaluation information from the search-results evaluation information input part 18
(Step D1). It then analyzes, for each of the evaluated search results, how the search
condition was satisfied (Step D2). By comparing this analysis result with the
similar-word knowledge, co-occurrence word knowledge, and related-term knowledge of each
search term obtained from the dictionary 16 for search term correction, it identifies
the search term considered to be an erroneous input and corrects it (Step D6 - Step
D7). Finally, the corrected search condition is passed to the search condition storage
part 13 (Step D8).

[0039] Drawing 10 shows an example of the evaluation information passed from the
search-results evaluation information input part 18 to the search term correction part
12. Here, suppose that the user believed he or she had inputted the search condition of
drawing 6 (6) but, because of an erroneous input, had actually inputted the search
condition of drawing 4 (6). Suppose further that three documents were retrieved by this
search and that the user gave a two-level evaluation of each. In drawing 10, 1
indicates that a document contains the search term and 0 that it does not; likewise, 1
indicates that a document satisfies the search condition and 0 that it does not.
Document 1 happened to contain both of the words "package" and "satellite" and
therefore satisfied the search condition, but because its contents were completely
unrelated to the "broadcasting satellite", "satellite broadcasting", "communications
satellite", and "satellite communication" that the user was looking for, it was
evaluated as not "useful". In contrast, document 2 and document 3 satisfy the search
condition by containing the words "communication" and "satellite", and were evaluated
as "useful" by the user.

[0040] The processing that the search term correction part 12 performs on an evaluation
result like that of drawing 10 is explained below. In drawing 10, looking for a search
term that appeared in a document evaluated as not "useful" and did not appear in any
document evaluated as "useful" yields the search term "package". Document 1, which
contains the word "package", satisfied the search condition but was evaluated as not
"useful", so the search term "package" may be an erroneous input. From this point on,
"package" can be corrected to "broadcast" by using the dictionary 16 for search term
correction as in the 1st embodiment.
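
The selection rule in paragraph [0040] (a search term that appears in a document judged not "useful" and in no document judged "useful") could be sketched as follows. The function name and the evaluation table are assumptions: the table is reconstructed from the prose of paragraph [0039], because the actual values of drawing 10 are not reproduced in the text.

```python
def suspect_terms(search_terms, evaluated_docs):
    """Return search terms seen in some not-useful document and in no useful document.

    evaluated_docs is a list of (terms_in_document, was_useful) pairs.
    """
    in_useful = set()
    in_not_useful = set()
    for doc_terms, useful in evaluated_docs:
        hits = set(search_terms) & set(doc_terms)
        if useful:
            in_useful.update(hits)
        else:
            in_not_useful.update(hits)
    return [t for t in search_terms if t in in_not_useful and t not in in_useful]


# Assumed reconstruction of the three evaluated documents of paragraph [0039].
docs = [
    ({"package", "satellite"}, False),        # document 1: not useful
    ({"communication", "satellite"}, True),   # document 2: useful
    ({"communication", "satellite"}, True),   # document 3: useful
]
suspects = suspect_terms(["package", "communication", "satellite"], docs)
print(suspects)
```

Each suspect term would then be handed to the dictionary-based correction of the 1st embodiment; with only a few evaluated documents the rule can also flag correctly entered terms, so it serves as a candidate filter rather than a final decision.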
[0041] Needless to say, as in the 1st embodiment, the search term correction processing
using the user's evaluation information in this 2nd embodiment is also effective for
searches other than Boolean search.
[0042] 

[Effect of the Invention] As explained in detail above, according to this invention,
erroneously inputted words and phrases are automatically detected and corrected, so
working efficiency improves greatly. Because an interpretation can be selected even
when the inputted words and phrases are polysemous, the efficiency and accuracy of a
search can be raised. Furthermore, the accuracy of re-retrieval can be raised by
feeding the user's evaluation of the search results back into the detection and
correction of words and phrases.

DESCRIPTION OF DRAWINGS 



[Brief Description of the Drawings] 

[Drawing 1] The figure showing the composition of the information retrieval system
concerning a 1st embodiment of this invention. 

[Drawing 2] The flow chart which shows an example of the flow of processing of the
search term correction part of the embodiment. 

[Drawing 3] The figure showing an example of the similar-word knowledge, co-occurrence
word knowledge, and related-term knowledge registered in the dictionary for search term
correction of the embodiment.

[Drawing 4] The figure showing examples of search conditions that are passed to the
search term correction part from the search condition input section of the embodiment
and that contain an erroneously inputted search term.
[Drawing 5] The flow chart which shows an example of the erroneous input judging 
algorithm (step A5 of drawing 2 ) of the i-th search term in drawing 2 of the embodiment. 
[Drawing 6] The figure which illustrates the result of having corrected the search 
condition of the embodiment. 



[Drawing 7] The figure showing the composition of the information retrieval system
concerning a 2nd embodiment of this invention. 

[Drawing 8] The flow chart which shows an example of the flow of processing of the 
search-results evaluation information input part of the embodiment. 
[Drawing 9] The flow chart which shows an example of the flow of processing of the
search term correction part of the embodiment. 

[Drawing 10] The figure showing an example of the evaluation information passed to the
search term correction part from the search-results evaluation information input part
of the embodiment.
[Description of Notations] 

10 - An information retrieval system, 11 - A search condition input section, 12 - A
search term correction part, 13 - A search condition storage part, 14 - A retrieval
part, 15 - A search-results outputting part, 16 - The dictionary for search term
correction, 17 - Search information, 18 - Search-results evaluation information input
part.